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A generic framework for the solution of PDE-constrained optimisation problems based on the FEniCS sys- 
tem is presented. Its main features are an intuitive mathematical interface, a high degree of automation, 
and an efficient implementation of the generated adjoint model. The framework is based upon the extension 
of a domain-specific language for variational problems to cleanly express complex optimisation problems 
in a compact, high-level syntax. For example, optimisation problems constrained by the time-dependent 
Navier-Stokes equations can be written in tens of lines of code. Based on this high-level representation, 
the framework derives the associated adjoint equations in the same domain- specific language, and uses 
the FEniCS code generation technology to emit parallel optimised low-level C++ code for the solution of 
the forward and adjoint systems. The functional and gradient information so computed is then passed to 
the optimisation algorithm to update the parameter values. This approach works both for steady-state as 
well as transient, and for linear as well as nonlinear governing PDEs and a wide range of functionals and 
O control parameters. We demonstrate the applicability and efficiency of this approach on classical textbook 

optimisation problems and advanced examples. 
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1. INTRODUCTION 



> 

On 

QQ Optimisation problems constrained by partial differential equations (PDEs) are ubiq 

uitous across science and engineering. Such problems consist of optimising an objec 
^ tive functional, e.g. maximising the performance or minimising the cost of a system, 



subject to constraints given by the laws of physics [Lions 1971]: for example, an aero- 



^ nautical engineer will want to choose the best shape for a wing to optimise its perfor- 
^ 
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mance [Jameson 1988]. Inverse problems may also be treated as optimisation prob- 
lems, where the goal is to infer some unobservable state from observable evidence; 
this is achieved by adjusting the unknown state to minimise some misfit functional 
[Le Dimet and Talagrand 1986]. This approach is now fundamental to the geosciences: 
for example, it is routinely used in operational meteorology [Rabier et al. 2000]. 

Approximating the solution of PDEs is computationally expensive. This motivates 
the use of gradient-based optimisation algorithms, since exploring the control space 
without derivative information typically requires a prohibitive number of PDE evalu- 
ations for practical problems. The typical case for PDE-constrained optimisation prob- 
lems is that where the dimension of the control space is large, and where the number 
of functionals to control is small (usually one). Therefore, their efficient solution relies 
on the fast gradient computation for a small number of functionals with respect to 
many parameters. 

A naive approach to computing the gradient of a functional is to perturb each con- 
trol parameter in turn, and approximate the gradient using finite differences. A more 
sophisticated way would be to employ the tangent linear model associated with the 
forward PDE system, which circumvents the problems of roundoff errors by propagat- 
ing derivative information forward through the computational graph, from one input 
parameter through to all outputs. However, with both of these approaches, the num- 
ber of PDE solves required for a single gradient computation scales linearly with the 
number of parameters, making them infeasible for the typical case described above. 
By contrast, the adjoint method computes the gradient of a scalar functional with a 
single PDE solve, by propagating derivative information backwards through the com- 
putational graph, from the output functional back to all inputs [Giles and Pierce 2000; 
Griewank and Walther 2008]. The adjoint method is a key ingredient in making the 
large-scale solution of complex optimisation problems feasible. 

However, deriving and implementing the adjoint PDE model is typically regarded as 
difficult. This has been one of the main motivations for the development of algorithmic 
differentiation techniques (AD, also called automatic differentiation), which attempt to 
automate the adjoint model derivation. However, in practice the application of an AD 
tool typically requires large amounts of user intervention and expertise, in particular 
for advanced forward model implementations. Naumann [2012, pg. xii] states that "the 
automatic generation of optimal (in terms of robustness and efficiency) adjoint versions 
of large-scale simulation code is one of the great open challenges in the field of High- 
Performance Scientific Computing". Giles and Pierce [2000] observe that 

Considering the importance of design to .. all of engineering, it is perhaps 
surprising that the development of adjoint codes has not been more rapid .. 
[I]t seems likely that part of the reason is its complexity. 

In previous work, we have efficiently solved this adjoint derivation problem for the 
case where the forward problem may be discretised using the finite element method 
[Farrell et al. 2012]. The key contribution of this paper is to apply this advance to 
automate the solution of large classes of PDE-constrained optimisation problems. The 
main features of this new framework are: 

Usability. The user specifies the discretised optimisation problem in a form that 
resembles the mathematical notation. 

Automation. Based on this problem specification, the framework performs the nec- 
essary steps for the optimisation, without further user intervention. These steps 
include interfacing with an optimisation method, followed by repeated PDE solves 
for evaluating the objective functional and computing the functional gradient using 
the automatically derived adjoint system. 
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Generality. The framework handles large classes of governing PDEs, including cou- 
pled, nonlinear and time-dependent PDEs. Furthermore, the user may choose from 
multiple gradient-free and gradient-based optimisation algorithms. 
Performance. Optimisation algorithms typically require many iterations, each of 
which involve computationally expensive PDE solves. For many problems of practi- 
cal interest, efficient parallel PDE solvers are therefore essential to obtain reason- 
able run times. Significant work has been undertaken in order to produce efficient 
assembly code in FEniCS [Kirby and Logg 2007; Markall et al. 2012; 01gaard and 
Wells 2010]. Since the adjoint model relies on the same FEniCS code generation 
techniques as the forward model, the performance of the adjoint implementation 
inherits the same efficiency and parallel scaling of the forward model. As a con- 
sequence, the achieved adjoint efficiency ratios are often close to the optimal ratio 
between 1 and 2; by contrast, a general AD tool typically yields efficiency ratios in 
the range of 3 to 30 [Naumann 2012, pg. xi]. 

These features are achieved by exploiting the particular structure of finite element 
models (in contrast to traditional AD, which attempts to solve the general case). Finite 
element discretisations of the governing PDE have a natural domain-specific language, 
the language of variational forms, that abstractly captures the mathematical structure 
of the problem without any details of how its solution is to be achieved on a particu- 
lar platform. Instead of implementing the model in tens or hundreds of thousands of 
lines of a low-level language such as Fortran or C++, the user compactly describes 
the discretisation of the forward problem in the Unified Form Language (UFL) [Alnses 
2011; Alnaes et al. 2012] of the FEniCS project [Logg et al. 2012]. UFL mimics the 
mathematical notation almost exactly, and can express even complex time-dependent 
coupled nonlinear problems in tens of lines of code. 

The presented framework uses the high-level UFL representation of the forward 
model for two purposes. First, the forward code is generated using the FEniCS project 
in the usual way: the UFL is passed to a dedicated finite element compiler, which 
generates optimised low-level C++ code for its parallel implementation on a particular 
platform [Kirby and Logg 2006; 01gaard and Wells 2010; Markall et al. 2012]. Second, 
as each forward solve is executed at runtime, a symbolic tape of the forward model is 
recorded in UFL format. This tape is analogous to the concept of a tape in AD [Corliss 
and Griewank 1993], except that instead of recording individual floating-point oper- 
ations, the units on the tape are whole equation solves. Once the forward model has 
terminated, symbolic manipulation is applied to this tape to derive the UFL represen- 
tation of the associated adjoint problem, which in turn is passed to the same compiler 
to emit an efficient parallel implementation of the adjoint model [Farrell et al. 2012]. 

In this paper we discuss the extension of this system to compactly express and solve 
optimisation problems. The user describes the forward model, the control parame- 
ters and the objective functional in an extension of the UFL notation. The optimisa- 
tion framework then repeatedly re-executes the tape to evaluate the functional value, 
solves the adjoint PDE to compute the functional gradient, and modifies the tape to 
update the position in parameter space until an optimal solution is found. 

1.1. Related work 

One of the main motivations for this work is the fact that, despite the broad applica- 
bility of PDE-constrained optimisation, there exist few software packages that gather 
and unify the tools required to solve such problems. 

A closely related project is developed by van Bloemen Waanders et al. [2002], with 
the goal of creating an optimisation framework based on the finite-element software 
Sundance [Long et al. 2012]. Sundance is similar to FEniCS in that it also operates on 
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variational forms: in particular, it can automatically differentiate and adjoin individ- 
ual variational forms. However, the built-in automatic adjoint derivation of Sundance 
does not currently extend to cases where the forward model consists of a sequence of 
variational problems, which is typically the case for time-dependent problems. 

Other open source alternatives include DAKOTA [Eldred et al. 1996], the Stanford 
University Unstructured suite [Palacios et al. 2012] and RoDoBo [Becker et al. 2005]. 
The key difference between these and the framework presented here is that the dif- 
ficult step of the adjoint derivation and implementation has not been automated. In- 
stead, if a new PDE is to be solved the adjoint model must be derived and implemented, 
either manually or with the help of AD tools, both of which demand significant devel- 
opment effort and user expertise. 

Finally the PROPT [Rutquist and Edvall 2010], ACADO [Houska et al. 2011] and 
CasADi [Andersson et al. 2012] toolkits are optimisation frameworks with similar de- 
sign goals, but focussed on ordinary differential equations and differential-algebraic 
equations instead of PDEs. 

The paper is organized as follows: The next section states the general form of PDE- 
constrained optimisation problems and compares different solution techniques. Sec- 
tion 3 presents the newly developed framework in detail, followed by a code demonstra- 
tion in section 4. In section 5 the framework implementation is verified using textbook 
examples with known anal3^ical solutions. Finally, section 6 applies the optimisation 
framework to a wide range of problems before making some concluding remarks in 
section 7. 

2. THE FORMULATION OF PDE-CONSTRAINED OPTIMISATION PROBLEMS 

We consider the PDE-constrained optimisation problem in the following general form: 

min J{u, m) 

u,rn 

subject to F{u, m) — 0, 
h{m) = 0, 
giTn) < 0, 

where the vector m contains the optimisation parameters, F{u, m) = is a system of 
PDEs parameterised by m with solution vector u, and J{u, m) e M is the scalar-valued 
objective functional that is to be minimised. The equality and inequality constraints 
h{m) = and g{m) < enforce additional conditions on the optimisation parameters 
m. A common example is a box constraint of the form: 

a < m < b. 

State constraints are not directly considered in formulation (1). However, in section 6.1 
we show an example where a penalisation approach is employed to enforce state con- 
straints. 

Throughout this paper we assume that the above operators are sufficiently diflfer- 
entiable, and that the PDE has a unique solution for any m, i.e. there is a solution 
operator u{m) such that F{u{m),m) =0 V m. 

2.1. The optimality conditions and the reduced formulation 

The necessary first order optimality conditions for problem (1), also known as the 
Karush-Kuhn-Tucker (KKT) conditions [Karush 1939; Kuhn and Tucker 1951], are 
derived from the associated Lagrangian. Ignoring the control constraints for simplic- 
ity, the Lagrangian of (1) is: 

£{u, TO, A) EE J{u, to) + \^F{u, to) e M, 
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where A is the Lagrange multipHer (also known as the dual or adjoint variable). The 
KKT conditions state that for every local minimum (u, fh) at which some regularity 
conditions are satisfied (see Hinze et al. [2009] for details), there exists a Lagrange 
multiplier A such that: 
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Equation (2b) is referred to as the adjoint equation; (2c) recovers the governing PDE. 
Solving (2b) for A and substituting the result into the control equation (2a) yields that 
the total derivative of the objective functional vanishes at the optimal point. 

One approach to compute a local solution of problem (1), known as the all-at-once ap- 
proach, simultaneous analysis and design, or the oneshot approach, is to directly solve 
the KKT system (2). A common solution method is sequential quadratic programming 
(SQP), which for the simplified case considered here is equivalent to applying New- 
ton's method to the KKT system (2). A key advantage of SQP is that it inherits the fast 
local quadratic convergence to a local solution from Newton's method [Boggs and Tolle 
1995]. However, the KKT system (2) yields significant challenges for numerical solvers: 
it is a coupled, nonlinear and often ill-conditioned system of PDEs and the resulting 
linear systems that need to be solved for each SQP update are high-dimensional. This 
issue becomes particularly problematic in the case of time-dependent governing PDEs 
where the discrete solution vectors u and A contain the entire forward and adjoint 
solution trajectories over time and space. 

Since direct solver methods are typically not suitable to solve linear problems of 
such dimensions, iterative solvers in combination with advanced preconditioning tech- 
niques are often applied and show promising results in certain applications [Batter- 
mann and Sachs 2001; Biros and Ghattas 2000; Schoberl and Zulehner 2007]. An al- 
ternative approach is to use a space reduction method; here, the solution of the full 
linear systems is avoided by performing a block-LU decomposition [Byrd and Nocedal 
1990; Biegler et al. 1995; Schulz 1998]. As a result, the system is uncoupled and can 
be solved in separate steps of more manageable size. Reduced SQP methods have been 
successfully applied to various applications [Kupfer and Sachs 1992; Orozco and Ghat- 
tas 1992; Orozco and Ghattas 1997]. 

The all-at-once approach has the disadvantage that the current estimate of u, m 
and A must be stored at any time. For large, time-dependent problems, storing the 
whole estimate of the forward and adjoint solutions u and A can quickly exceed the 
available memory: for example, a simulation with 10^ spatial degrees of freedom and 
10^ timesteps would require roughly 1000 TB of memory in double point precision, 
exceeding the memory capacities of most available computers. 

This issue can be circumvented by performing a space reduction on the original op- 
timisation problem (1). This approach (also known as black-box or nested analysis 
and design approach) replaces the objective functional J(u, m) with the reduced func- 
tional J(m) = J(u(m), m), that is the functional is considered as a pure function of the 
optimisation parameters. Since the reduced functional implicitly enforces the solution 
of the PDE, the PDE -constraint becomes superficial in the optimisation formulation. 
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The result is the following reduced optimisation problem: 

min J(m) 

subject to h{m) = 0, 
g{m) < 0. 

This formulation has the practical advantage that the dimension of the optimisa- 
tion problem is greatly reduced, since the dimension of m is typically much smaller 
than that of the PDE solution u. Consequently, many robust and established opti- 
misation methods are directly applicable. Furthermore, the storage requirement is 
significantly lowered; firstly because the adjoint solution is only used for computing 
the functional gradient and does not need to be saved and secondly because the stor- 
age of the entire forward solution trajectory may be avoided by using a checkpointing 
strategy to balance storage and computation cost. This allows the solution of large 
scale optimisation problems for which storing the whole forward solution would be im- 
possible [Griewank 1992]. In the example above, the optimal checkpointing scheme 
implemented by Griewank and Walther [2000] with 1000 checkpoints reduces the stor- 
age cost to approximately 10 GB while the computational cost approximately doubles. 
Another advantage of the reduced formulation is that the governing PDE is always sat- 
isfied after each optimisation iteration. Hence, the optimisation loop may be stopped 
as soon as the functional is sufficiently reduced, simplifying the formulation of termi- 
nation criteria. 

Long et al. [2012] state that for steady problems, the all-at-once can outperform the 
reduced formulation by a wide margin, but that for time-dependent systems, the re- 
duced formulation is often preferable. Since the framework supports time-dependent 
problems, the current implementation solves the optimisation problem in the reduced 
formulation (3). However, in principle all components for solving the all-at-once ap- 
proach are available, and we plan to implement this approach in future work. Fur- 
thermore, the reduced formulation is often used to precondition the all-at-once ap- 
proach [Biros and Ghattas 2000]. 

The following example illustrates the two formulations (1) and (3) on a classical op- 
timal control problem (see for example Troltzsch [2005, chapter 2.1.5] or Hinze et al. 
[2009, chapter 1.5.3]): given a thin, beatable plate c with fixed temperature at 
the domain boundary dfl, what is the optimal heating function that should be applied 
to obtain a desired temperature profile? This problem can be formulated as an optimi- 
sation problem constrained by the stationary heat equation: 



min Urf||^2(o) + iTNIIi2(a) 

subject to - V^ii = m on il, (4) 
u = on 9f2, 

a < ni < b onft. 

Here : f2 — M is the desired temperature profile and u : — s> M is the solution 
of the stationary heat equation with homogeneous Dirichlet boundary conditions. The 
objective functional measures the misfit between the PDE solution and the desired 
temperature profile plus a regularisation term that is multiplied by a scaling factor 
a > 0. The optimisation parameter m : Q R controls the heat source and is limited 
by the box constraints. 

Under mild assumptions, the heat equation with Dirichlet boundary conditions 
yields a unique solution for any source parameter m [Hinze et al. 2009, §1.3.1.1]. Hence 
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there exists a solution operator u{m) and the reduced problem may be formulated: 

min ^\\u{m) - Ud||i2(o) + |Nlli2(n) 
subject to a < m < 6 onfl. 
This problem will be used for the code example in section 4. 
3. THE OPTIMISATION FRAMEWORK 

The core of the framework relies on two software components: first, the FEniCS sys- 
tem [Logg et al. 2012; Logg 2007] is used to solve the forward and adjoint PDEs. Sec- 
ond, it relies on libadjoint and dolfin-adjoint [Farrell et al. 2012] to automatically de- 
rive the associated adjoint system for gradient computations. In this work we have 
extended the dolfin-adjoint framework to go beyond adjoint derivation, to automate 
the solution of PDE-constrained optimisation problems. 

The following sections discuss these components in detail. The source code, includ- 
ing the examples and applications in the following sections, is available at http: 
//dolf in- ad j oint . org. 

3.1. The FEnlCS system 

The FEniCS system is a collection of software components for automating the solution 
of PDEs by the finite element method. This section gives a brief introduction to the 
FEniCS system. A thorough overview can be found in Logg et al. [2012]. 

To solve a PDF with the FEniCS system, the user defines its discretised weak form 
in the domain specific language UFL that mimics and encodes the mathematical for- 
mulation [Alnses 2011; Alnses et al. 2012]. This high-level formulation is then passed 
to a finite element form compiler such as FFC [Kirby and Logg 2006], which generates 
optimised low-level code for the evaluation of the local element tensors. This generated 
code is used by DOLFIN [Logg and Wells 2010; Logg et al. 2011] to globally assemble 
and solve the problem. DOLFIN also provides the data structures for meshes, function 
spaces and functions. 

For time-dependent PDEs, the temporal discretisation is usually performed with a 
non-finite element discretisation scheme. In this case, the user writes the time loop 
manually and solves the variational problem for each timelevel as described above. 

DOLFIN has interfaces for both C++ and P3^hon. The Python interface uses just in 
time compilation, i.e. it invokes the necessary compilers at runtime. In contrast, the 
code generation for the C++ interface happens at a preprocessing step before running 
the forward model. As a consequence, the high-level description of the forward problem 
is not directly available at runtime in the C++ interface. Because dolfin-adjoint relies 
on this data to perform runtime inspection and manipulation, the remaining sections 
discuss only the P3^hon interface to DOLFIN. 

3.2. libadjoint and dolfln-adjoint 

The libraries libadjoint and dolfin-adjoint enable the automatic derivation and solution 
of tangent linear and adjoint models from forward models written in DOLFIN. 

The purpose of libadjoint is to facilitate the development of tangent linear and ad- 
joint models based on the fundamental abstraction of considering the forward model as 
a sequence of equation solves. Based on this abstraction, the library builds a symbolic 
description of the forward model, the tape, from which it can automatically derive the 
symbolic representation of the associated tangent linear and adjoint systems. 

The software library dolfin-adjoint acts as the between DOLFIN and libadjoint. It 
inspects DOLFIN's problem description at runtime, and performs the required tasks 
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discrete forward equations [UFL] 



libadjoint, dolfin-adjoint 



discrete adjoint equations [UFL] 



FEniCS system^ 



FEniCS system 



forward model [code] 



adjoint model [code] 



Fig. 1: The automated generation of the adjoint model with dolfin-adjoint. The user 
specifies the discrete forward equations in the high-level UFL language. From that 
symbolic representation of the problem, libadjoint and dolfin-adjoint can derive the 
corresponding representation of the discrete adjoint equations in UFL. Both the for- 
ward and adjoint equations are passed to the FEniCS system to generate the forward 
and adjoint model implementations. 



for applying libadjoint without user intervention. The tangent linear and adjoint mod- 
els produced with libadjoint are represented in the same high-level data format as the 
forward model. Therefore, the code generation techniques in FEniCS can be applied 
to the adjoint model in the same manner as the forward model, see figure 1. Farrell 
et al. [2012] showed that this approach leads to a robust, automatic and efficient way 
of implementing tangent linear and adjoint models. 

3.3. The optimisation frameworl< 

3.3.7. Introduction. The proposed framework extends dolfin-adjoint to solve PDE- 
constrained optimisation problems. The optimisation process consists of iteratively 
evaluating the functional of interest and its gradient at different points in the param- 
eter space. The key idea is to automate these evaluations by operating purely on the 
tape of the forward model recorded by libadjoint. In particular, a functional evaluation 
is obtained by replaying the tape (which runs the forward model) while simultaneously 
evaluating the objective functional. The functional gradient is computed by deriving 
and solving the adjoint model, as described in the previous section. When the optimisa- 
tion algorithm updates the point in parameter space, the tape is modified accordingly. 

With this approach, the only inputs required from the user are the objective func- 
tional, the control parameters and the governing PDEs, optionally with additional 
equality and inequality constraints. 

3.3.2. User interface. The solver for an optimisation problem is typically implemented 
in the following three steps. 

First, the user implements the governing PDF in the P3^hon interface of 
DOLFIN [Logg et al. 2012]. DOLFIN supports linear and nonlinear as well as steady 
and transient PDEs, and has been used to solve complex coupled systems such as 
the Reynolds-averaged Navier-Stokes equations for turbulent flows [Mortensen et al. 
2011], the Stokes equations for mantle convection [Vynnytska et al. 2013], and the 
Cahn-Hilliard equations for phase separation [Wells et al. 2006]. 

Second, the user defines the objective functional and the optimisation parameters. 
For that, a Functional class has been introduced that extends DOLFIN to support the 
expression of time-dependent functionals. For example, an objective functional com- 
puting: 

/ / u • vdxdt 
Jo Jn 

is defined by: 
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Alternatively, the functional can be evaluated at a specific time. For example 

/ u{t = 1) • v{t = l)da;, 
Jn 

is implemented by: 

J = Funct ional( inner (u, v)*dx*dt[l]) 

Functionals may consist of forms integrated over time, forms evaluated at a particular 
time, and sums of these. This allows the construction of complex objective function- 
als, e.g. a data assimilation problem with a regularisation term involving the initial 
condition might use the following objective functional: 

/ \\u - Uobslli2(o) dt + \u{t = 0)|^i(f2). 



The implementation of that functional is: 

J = Functional (inner (u - u_obs, u - u_obs)*dx*dt 
+ inner (grad (u) , grad (u) ) *dx*dt [0] ) 

Similarly, the user specifies the optimisation parameter m. For example, the following 
code defines the optimisation parameter to be the initial value of u: 

m = InitialConditionParsmieter (u) 

If a scalar value s is to be optimised, one may use 
m = ScalarPareuneter (s) 



In the final step, the user defines the reduced functional J with 
J_liat = ReducedFunctional ( J , m) 

In order to optimise for multiple parameters simultaneously, the user can optionally 
pass a list of Parameter objects. 

At this point, the reduced functional object J Jiat can be used to solve the associated 
minimisation problem min„i J(m) by calling: 

m_opt = minimize (J_hat) 

or the associated maximisation problem max™ J(m) by calling: 
m_opt = maximize (J_hat) 

Both of these functions solve the optimisation problem with the default settings and 
return the optimised parameters when finished. Additional arguments may be used to 
set and configure the optimisation algorithm and to define box, inequality or equality 
constraints, e.g.: 

m_opt = maximize (J_hat , bounds = (u_lower, u_upper) , method= ' SLSQP ' ) 

Currently, the framework supports most of the algorithms in the optimisation pack- 
age of SciPy [Jones et al. 2001]. For problems without additional constraints these 
are: 
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— the Nelder-Mead method [Nelder and Mead 1965]; 

— a modification to Powell's method [Powell 1964]; 

— a nonlinear conjugate gradient method [Nocedal and Wright 2006, §5.2]; 

— the BFGS method [Nocedal and Wright 2006, §6.1]; 

— the Newton-CG method [Nocedal and Wright 2006, §7.1] 

— and simulated annealing [Laarhoven and Aarts 1987]. 

Both Powell's method and simulated annealing are gradient-free optimisation algo- 
rithms. For problems with box, inequality or equality constraints, the user has the 
choice between: 

— sequential quadratic programming [Kraft 1994]; 

— the gradient-free Constrained Optimization by Linear Approximation method [Pow- 
ell 1994]. 

Finally, if only box-constraints are present, the 

— L-BFGS-B method [Nocedal and Wright 2006, §7.2]; 

— a Newton-CG implementation that supports box constraints [Nash 1984], 

may be used. 

The user also has access to more advanced features, such as to automatically ver- 
ify each gradient computation with the Taylor remainder convergence test during the 
optimisation procedure, or to execute user-supplied code after each gradient computa- 
tion, for example to create convergence plots. 

3.3.3. Implementation. The ReducedFunctional class provides two main functionalities: 
the evaluation of the objective functional for a given parameter value, and the compu- 
tation of the functional gradient. 

The functional evaluation is implemented by solving the forward equations that 
have been stored on libadjoint's tape and simultaneously computing the objective func- 
tional. However, the tape contains the parameter values of the initial forward run and 
a naive replay would reevaluate the forward model with the original parameters. This 
issue is resolved by first modifying the tape so that it reflects the most recent parame- 
ter values before executing the functional evaluation. 

The implementation of the gradient computation relies on the adjoint derivation of 
libadjoint and dolfin-adjoint to compute the gradient with the adjoint approach. Op- 
tionally, the user may use a checkpointing scheme to balance the storage and recom- 
putation cost of the gradient computation [Griewank 1992; Farrell et al. 2012]. 

The minimize and maximize routines implement the interface to the optimisation al- 
gorithms. Optimisation methods typically require the implementation of the functional 
evaluation and its gradient. The minimize and maximize routines generate these func- 
tions using the ReducedFunctional object. This involves the conversion of DOLFIN 
data structures to generic array data types on which the optimisation algorithms op- 
erate. 

Parallel execution is often crucial in PDE-constrained optimisation to achieve rea- 
sonable run times. While DOLFIN supports the parallel solution of the forward and 
adjoint models, the considered implementations of the optimisation algorithms are not 
designed for distributed execution. The minimize and maximize routines circumvent 
this problem by executing the PDE solves in parallel, but replicating the optimisation 
computation on each processor. For that, the minimize and maximize functions serialise 
the distributed data structures of the optimisation parameters, the functional gradi- 
ent and the constraints, as these are used by the optimisation algorithm. All other 
variables, in particular the forward and adjoint solutions, remain distributed and are 
not communicated. This approach works well for small-scale optimisation problems. 
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where the communication time for gathering the data and the execution time spent in 
the optimisation algorithm is small compared to the runtime of the PDE solves. For 
large-scale optimisation problems one should consider interfacing a parallel optimisa- 
tion algorithm, such as TAO [Munson et al. 2012] or OPT++ [Meza 1994]. 

Finally, there are cases where optimisation based purely on libadjoint's tape is not 
desired. For example, a forward model with an adaptive timestepping scheme changes 
the timestep according to certain conditions. This adaptivity is not reflected in the 
tape, and hence the optimisation process would only use the timestep choice that was 
used to build the tape. In such cases, the user can manually implement the functional 
evaluation; however, this approach has the disadvantage that the computed gradients 
become inconsistent with the discrete forward model, which can result in a reduced 
convergence of the optimisation algorithm. 

3.4. Restrictions 

The first restrictions are those associated with the adjoint computation [Farrell et al. 
2012, §5.4]. In particular, the automated adjoint derivation relies on the differentiabil- 
ity of the forward model, and that the forward model is implemented entirely in the 
Python interface to DOLFIN. 

State constraints other than the PDE constraint are not handled automatically. 
However, the framework does provide many of the tools required to implement the 
solution of such problems. An example is presented in section 6.1, where a problem 
with a variational inequality is broken down into a sequence of PDE-constrained opti- 
misation problems, each of which is solved with the framework presented here. 

Another restriction is that shape optimisation is not yet automated. It is possible 
to apply the shape calculus approach [Schmidt and Schulz 2010; Schmidt et al. 2011] 
which derives an expression for the shape gradient of the functional in terms of the 
forward and adjoint solutions, which may then be computed using the automatically 
generated adjoint model. In future work, we plan to automate this shape calculus anal- 
ysis to extend the framework to support automated shape optimisation. 

4. EXAMPLE CODE 

This section demonstrates the application of the presented optimisation framework 
to two optimal control problems. The first example solves the optimal heating prob- 
lem (4). The governing PDE in this case is the stationary heat equation. The second 
example replaces the stationary heat equation with a time-dependent, nonlinear PDE. 
Although this problem adds significant complexity to the forward and adjoint PDEs, 
the required code changes are minimal. 

4.1. Distributed control of the heat equation 

The first example solves the optimal control problem of the stationary heat equa- 
tion (4). The problem possesses a unique optimal solution if the box constraints a and 
b are bounded [Troltzsch 2005, chapter 2.5.1]. Otherwise the regularisation term must 
be strictly positive, i.e. a > 0, to ensure uniqueness. In the following example, these 
conditions are satisfied by choosing a = 0, a = and b = 0.5. 

To begin with, the user implements and solves the forward problem. By doing so, the 
optimisation framework creates the tape of the forward model. This tape is used by 
the optimisation framework to compute all future functional evaluations and gradient 
computations. 

For this example the two-dimensional domain Q = [—1,1]^ was uniformly discretised 
using triangular elements. The finite-dimensional function spaces are constructed us- 
ing PlcG elements for the PDE solution u and the desired temperature profile Ud, and 
POdg elements for the heat source m. The latter choice is motivated by the fact that 
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the control can possess discontinuities for a = 0. The following Python code initialises 
the domain, the function spaces and all required functions in DOLFIN: 



# Define the domain [-1, 1] x [-1, 1] 
n = 200 

mesh = RectcLngleMesh(-l , -1, 1, 1, n, n) 

# Define the function spaces 

V = FunctionSpace (mesh, "CG", degree = 1) 
W = FunctionSpace (mesh, "DG", degree = 0) 

# Define the functions 

u = Function(V, name = "Solution") 
m = Function(W, name = "Control") 

V = TestFunction(V) 



The weak form of the heat equation is obtained by multiplying the PDE with a test 
function v e V, then integrating the result over the domain and integrating by parts. 
The resulting weak formulation is: find u <eV such that: 

The associated code resembles the mathematical notation closely: 



# Solve the forward model to create the tape 
F = ( inner (grad (u) , grad(v)) - m*v)*dx 
be = DirichletBC(V, 0.0, "on_boundary") 
solve(F == 0, u, be) 



Next, the objective functional is defined. In this example, the desired temperature 
profile Ud is defined as: 

«,(x,y) = e-i/(i-^)-V(i-.^), 

plotted in figure 2a. The relevant code is: 

# Define the functional of interest 
u_d = exp(-l/(l-x[0]*x[0])-l/(l-x[l]*x[l])) 
J = Functional ((0.5*inner(u-u_d, u-u_d))*dx) 



At this point, the optimal control problem can be solved: 



# Define the reduced functional 

J_hat = ReducedFunctional(J, SteadyParameter (m) ) 

# Solve the optimisation problem 

m_opt = minimize (J_hat, method = "L-BFGS-B", bounds = (0.0, 0.5), 
options = {"gtol": le-16, "ftol": le-16» 

The last parameter of minimize sets the termination condition to stop if either the 
infinity norm of the projected gradient or the relative change of the functional value 
drops below the specified tolerance. 

The optimisation algorithm starts with a zero estimate for the control, i.e. m^^^ = 0. 
The corresponding functional value is J (m^"^) = 8.9 x 10^"'. The results of the optimi- 
sation are shown in figure 2. The final value of the objective functional is 1.4 x 10~^. 
The desired and optimised temperature profile are of similar shape, but their maxi- 
mum values differ significantly. This is reflected in the fact that the box constraints 
are active in large parts of the domain (figure 2d). 
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Fig. 2: The solutions of the stationary optimal heating problem with box constraints. 
The heat source control is limited by the box constraints in large parts of the domain, 
which leads to a relatively large difference between the desired and optimised temper- 
ature profiles. 

By removing the box constraints, a better agreement between the desired and op- 
timised temperature profiles is expected. Therefore the box constraints were removed 
and the regularisation parameter set to a = 10~^, in order to ensure the uniqueness of 
the optimal solution. The results are shown in figure 3. Compared to the previous re- 
sults, the pointwise difference between the desired and optimised temperature profiles 
is significantly decreased, yielding a final functional is 1.6 x 10^''. 

4.2. Distributed control of a nonlinear, time-dependent PDE 

This example modifies the optimal heating problem by replacing the stationary heat 
equation with a time-dependent, nonlinear PDE: 

min ^||u(t = r)-Wd||i2(n) ' 
du 

subject to 



dt 
-It = 
-It = 
a < m < b 



on fix (0,T], 

ondn X (0,T], 
for n X {0}, 
ondn X (o,r], 



(5) 



where m : x (0, T] — > M and : x (0, T] — > M are the time-dependent PDE solution 
and desired temperature profile, respectively, and the control m : — > M is constant 
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(c) Difference between optimal and desired (d) Optimised heat source control m 
temperature profiles u ~ 

Fig. 3: The solutions of the stationary optimal heating problem without box con- 
straints. The optimised heat source control achieves a good agreement between the 
desired and optimised temperature profiles. 



in time. The nonlinearity and time-dependency of the new governing PDE adds signif- 
icant complexity to the solution process of the optimisation algorithm. In particular, 
the gradient computations involve the storage of the forward solution trajectory (or 
the application of some checkpointing scheme) and the solution of the associated time- 
dependent adjoint system. 

However, since the steps for recording the tape and the functional and gradient eval- 
uations are automated, the code from the previous example can be reused almost en- 
tirely to solve problem (5). The only modification required is to replace the weak formu- 
lation of the heat equation with the new governing PDE. The following Python code 
shows an example implementation with a backward Euler time discretisation and a 
Newton solver for the nonlinear equation solve at each timestep: 



# Define the weak form 
timestep = 0.01 

F = ((u - u_old.) *v/timestep + inner (grad (u) , grad(v)) + u**3*v - m*v)*dx 

# Perform 10 timesteps 
for t in rangedl): 

solve (F == 0, u, be) 
u_old. assign(u) 
adj _inc_t imestep ( ) 
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3.70 X 10- 


2 


1.34 
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2.08 


3.12 X 10" 


2 
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2 


1.11 


2.20 X 10- 


-5 


2.03 


1.56 X 10" 


2 


8.33 X 10- 


3 


1.04 


5.43 X 10- 


-6 


2.02 


7.81 X 10- 


3 


4.11 X 10- 


3 


1.02 


1.36 X 10- 


-6 


2.00 



Table I: The rate of convergence for the smooth control test. The control error shows 
the expected first order convergence, and the PDE solution converges as expected at 
second order. 

The optimisation framework can also use a checkpointing scheme to reduce the 
storage cost of the gradient computations. For example, the multistage checkpointing 
scheme developed by Stumm and Walther [2009], which supports storing checkpoints 
both in RAM and on disk, is activated with: 

adj_checkpointing(strategy='multistage' , steps=ll, 
snaps_on_disk=2 , snaps_iii_ram=3) 



5. VERIFICATION 

This section applies the optimisation framework to problems with analytical solutions 
in order to compare the numerical order of convergence with the theoretical expecta- 
tion. An agreement of the convergence rates is considered to be a strong indicator that 
the implementation is correct [Salari and Knupp 2000]. 

The considered analytical solutions are based on the optimal control problem (4) of 
the heat equation, extended with an additional source term s : — > M: 

min \\u-Ud\\l2m) 

subject to — V^M = m + s 

u = 

- 1 < TO < 1 

The following sections perform the two convergence tests, the first one with a contin- 
uous optimal control function, and the second one with a discontinuous optimal control 
function. 

5.1. Smooth control 

The first test is based on an analytical solution with a smooth optimal control function: 

TO°P*(a;, y) = sin (nx) sin (ttj/) , 

=u°/\x,y) = ^sin(7rx) sin(7ry), 
s = 0. 

It is easy to see that this choice forms an optimal solution to problem (6). 

The discretisation and optimisation parameters are configured identically to the 
setup described in section 4.1. Then the convergence test was performed on five uni- 
formly discretised meshes with decreasing mesh element sizes. The resulting errors 
and convergence rates are given in table I. The first-order convergence for the control 
solution TO is expected as the underlying function space is discretised with POdg fi- 
nite elements. Similarly, a second-order rate of convergence is observed for the PDE 
solution u, as it is discretised with PIcg finite elements. 
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Element size 


j|m-m°P' 


1- 


order 


||li-U°P' 




order 


3.13 X 10"^ 


4.76 X 10- 






7.38 X 10- 


-4 
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2.41 X 10- 
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0.99 
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1.00 


5.04 X 10" 


-5 


1.99 


3.91 X 10-^ 


5.30 X 10- 


3 


1.18 


1.24 X 10" 
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3 


1.20 
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6 


2.29 



Table II: The rate of convergence for the bang-bang control test. The control error 
shows the expected first order convergence, and the PDE solution converges at second 
order as expected. 

5.2. Bang-bang control 

The second verification test is motivated by the fact that box constraints can lead to 
optimal control solutions with discontinuities. The following test, derived in Troltzsch 
[2005, chapter 2.9.1], yields an optimal control function with discontinuities in a 
chessboard-like shape: 

m°^*{x, y) = —sign (— sin (87ra;) sin (Sttj/)) . 

This kind of control, where the control values jump from one box constraint limit to 
the other, is also known as bang-bang control. The optimal state solution is chosen to 
be: 

y) = sin(7ra;) sin(7r?/). 

By applying the optimality conditions, the source term s and the desired PDE solution 
Ud are obtained: 

s{x^ y) = sin(7ra;) sin(7rj/) + sign (— sin (87ra;) sin (Sttj/)) , 

and: 

Ud(x^ y) = sin(7ra:) sin(7rj/) + sign (— sin (87ra::) sin {Si:y)) . 

The convergence test is performed with the same configuration as in the previous 
test. The resulting errors and convergence rates are given in table II. It shows the 
expected first-order convergence for the control and second-order convergence for the 
state solution, indicating that the optimisation implementation is correct. 

6. APPLICATIONS 

6.1. Optimal control governed by an elliptic variational inequality 

In this section we apply the framework to an optimal control problem investigated in 
Hintermiiller and Kopacka [2011, §5.2]. This problem involves a variational inequality 
constraint on the state, and is an example of a mathematical program with equilibrium 
constraints (MPEC). Let K = {v ^ H^{^) : w > 0}. The problem is stated as: 

1 2^2 

min J{u,m) = - \\u-Ud\\L2^n) + ^ W^Wl^u) (^a) 

subject to u e K, (7b) 

(Vm, V(w - u))^ > {f + m,v - u)^ y V e K, (7c) 
a < m < b a.e. in fi, (7d) 

where u : — > is the state variable, it^ : — > is a prescribed state to be matched, 
m : — > M is the control variable to be determined, / : — > M is a prescribed source 
term, a e M and e M are the lower and upper bounds on the control, and e M is a 
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Fig. 4: The feasibility violation of the state solution u as a function of a. As a ap- 
proaches 0, the variational inequality m > a.e. in O c is enforced. 

regularisation parameter. Note that the inequality u > a.e. is implied hy u £ K. Fol- 
lowing the penalisation approach [Tremolieres et al. 1981; Hintermiiller and Kopacka 
2011], the variational inequality can be approximated by the penalised equation: 



where a > is the penalty parameter. It is well known that the solution Ua of (8) 
converges to that of the variational inequality (7c) as a | 0. As the max operator is not 
diflferentiable, it is regularised in turn with a smoothing parameter e > (the "global" 
regularisation of Hintermiiller and Kopacka [2011, equation (2.4)]). Therefore, to solve 
(7), a sequence of problems are solved where the variational inequality (7c) is replaced 
with the regularised penalised equality constraint (8), and the penalisation parameter 
a is driven to zero. The solution of one iteration is used as the initial guess for the next. 

The PDE constraint is discretised using linear finite elements for the state and con- 
trol. The PDE is first solved once with a zero control m to build a tape of the forward 
model, and then all subsequent steps are performed by operating on this tape. The 
value of e is set to 10^'', while ly is set to 10^^. The remaining parameters are taken 
from Hintermiiller and Kopacka [2011, §5.2]). The value of a is initialised to lO^^^ and 
halved at each penalisation iteration. The penalised subproblem is solved with a call 
to minimize, which applies a limited-memory BFGS algorithm with control bounds 
support [Zhu et al. 1997]. The entire program consists of less than 50 lines of code. 

The feasibility of the state is measured by computing the diagnostic 
||max (0, — u)||^2(Q); figure 4 shows its evolution as a function of a. The control 
and state solutions of the optimal control problem are shown in figure 5. Excellent 
agreement is found with the solutions of Hintermiiller and Kopacka [2011, figures 5 
and 6], giving confidence that the solutions are correct. 



(Vu,Vu) + -{~ma.x{0,-u),v) = {f + m,v) Vu G H^{^), 



(8) 



a 
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Runtime (s) 


Ratio 


Forward model 


69.08 




Forward model + adjoint model 


77.85 


1.126 



Table III: Timings for the MPEC gradient calculation. The efficiency of the adjoint 
approaches the theoretical ideal value of 1.125. 



Finally, the efficiency of the gradient calculation V J is benchmarked. For gradient- 
based optimisation algorithms to be practical, the computation of the gradient must 
be affordable. To investigate the performance of the adjoint-based gradient calculation, 
one execution of the forward and adjoint models was timed; this was repeated several 
times to ensure the results were representative. The results are shown in table III. 
As the forward model in this case takes eight Newton iterations to converge, the ad- 
joint should take approximately one eighth the cost of the forward model, for an ideal 
R = (forward + adjoint) /(forward) ratio of 1.125. The observed ratio is 1.126, demonstrat- 
ing the claim in Farrell et al. [2012] that the gradient calculation with dolfin-adjoint 
approaches optimal theoretical efficiency. 



6.2. Optimal placement of tidal turbines 

This application investigates an essential problem in the tidal energy industry. The 
core idea is to place turbines in the ocean to extract the kinetic energy of the tidal 
flow and convert it into electricity. In order to extract an economically useful amount 
of power, many turbines (possibly hundreds) must be deployed in the tidal stream. The 
question is: how should these turbines be placed in relation to each other to maximise 
the power extracted? The strong nonlinear interaction between the turbines, the com- 
plicated constraints on the configuration (legal site restrictions, bounds on the gradient 
of the seafloor), and the sensitive dependence of the power on the configuration make 
it difficult to manually identify an optimal configuration. 

This problem is formulated as an optimisation problem constrained by the station- 
ary, nonlinear shallow water equations with appropriate initial and boundary condi- 
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Fig. 6: The solution of the optimal turbine layout problem. 

max J{u, m) 

u.rn 

subject to u ■ Vu - vV^u + gV?/ — ^i^) \\u\\u on O, (g) 
V • (Hu) = onn, 

where c is the computational domain, the unknowns u : fl ^ R"^ and rj : fl ^ R 
are the depth-averaged velocity and the free-surface displacement, respectively, H eR 
is the water depth at rest, g e M is the gravitational force, e M is the viscosity coef- 
ficient, and Cf, G M and ct represent the quadratic bottom friction and the turbine pa- 
rameterisation, respectively. In a practical application, problem formulation (9) should 
of course be extended to the time-dependent shallow water equations to account for 
the flood and ebb tides. 

The functional of interest J is defined to be the power extracted due to the increased 
friction in the turbine farm [Ben Elghali et al. 2007; Divett et al. 2011]: 

Jiu,m) = 11 pctim)\\ufdfl, (10) 

where p is the density of water. 

The vector of parameters m e in (9) encodes the x and y positions of N turbines 
as: 
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Runtime (s) 


Ratio 


Forward model 


30.41 




Forward model + adjoint model 


34.22 


1.125 



Table IV: Timings for the turbine position gradient calculation. The efficiency of the 
adjoint is that of the theoretical value of 1.125. 



The turbines are parameterised by increased friction located around the turbine cen- 
tres. The corresponding friction function ct(m) is defined as: 



N 



ct{m){x,y) = ^Kipp^^rix^py^riy), dD 

where if = 21 is a scaling factor, r = 40 m is the extent of the parameterised turbine, 
and V' is a smooth bump function with compact support defined as: 



ei-i/(i-ll '" 11") for ll ^'^'^l.'^"^ !! < 1, 
otherwise. 



The shallow water equations were discretised using the P2-P1 finite-element pair. 
For performance reasons, the function C((m) was implemented in Python instead of 
expressing it as part of the UFL formulation. Consequently, the dependency on m does 
not occur explicitly in the UFL form and hence dolfin-adjoint cannot automatically 
compute its derivative. However, dolfin-adjoint is able to automatically compute the 
derivative of J with respect to Cf, and so this problem can be circumvented by over- 
loading the ReducedFunctional class and manually implementing the final step of the 
chain rule: 

^ dJ dct 

VJ(m) = • 
dct dm 

The first term is automatically computed using dolfin-adjoint. The second term 
can easily be derived and implemented by hand by differentiating (11). Once this 
ReducedFunctional class was implemented, the optimisation framework could be used 
as usual. 

The example considered here optimises a deployment site near the Orkney Islands 
in Scotland, where 32 turbines are to be installed (figure 6a). A constant input flow with 
2 m/s speed is enforced on the left boundary, while the free-surface displacement on the 
right boundary is set to zero. A no-normal flow condition is applied on all remaining 
boundaries. The remaining parameters are H ^ 50 m, ly — 90 m^/s, g = 9.81 m/s^. 

The turbines are initially distributed in a structured manner as shown in figure 6c. 
The optimisation was performed using the SQP implementation of Kraft [1994] until 
the relative reduction of the functional of interest dropped below 10^®. Bound con- 
straints ensured that the turbines remained in the site area, which models the fact 
that site developers acquire a license for a particular site, and cannot deploy outside 
it. Furthermore, a set of inequality constraints were used to enforce a minimum dis- 
tance of 60 m between each turbine. 

The results are presented in figure 6. The optimisation algorithm terminated after 
81 iterations. The optimisation increased the total power output of the turbine farm by 
12%, from an initial value of 164 MW to 183 MW. 

Table IV compares the runtime of the forward model and the runtime of the gradient 
calculation. Both were performed in parallel with 4 CPUs. The ratio of forward and 
adjoint runtimes is close to the theoretical ideal, as expected. 
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6.3. Data assimilation with wetting and drying 

6.3.7. Introduction. Wetting and drying processes such as tsunami inundation or the 
flooding and receding of tides play an important role in the study of tsunamis [Kowalik 
et al. 2005], tidal flats and river estuaries [Zhang et al. 2009; Xue and Du 2010], and 
flooding events [Westerinli et al. 2008; Song et al. 2011]. Many algorithms have been 
proposed for the numerical simulation of wetting and drying processes, both for the 
shallow water equations (Medeiros and Hagen [2013] and the references therein) and 
for the Navier-Stokes equations [Funke et al. 2011]. 

In this example, we consider a data assimilation problem where the goal is to recon- 
struct the profile of an incoming tsunami from observations of the wet/dry inundation 
interface. The tsunami is modelled by the time-dependent nonlinear shallow water 
equations with the wetting and drying scheme proposed by Karna et al. [2011]. The 
resulting optimisation problem is: 

min J{ri) 

m.u,ri 

subject to ^ + {u-V)u + gVr]= ^^^^^ \\u\\u on x (0,r], 

dt^ H (12) 

— +V-(i/u)==0 onl7x(0,r], 
ot 

H = m on af^z3 X (0,T]. 

where c M-^ is the spatial domain, T is the final time and u : x (0,T] 
and r/ ; X (0,T] M are the unknown depth-averaged velocity and free-surface 
displacement, respectively. In the classical shallow water equations the total water 
depth is defined as H = r] + h, where h : fl ^ R is the static bathymetry; in order 
to account for wetting and drying, Karna et al. [2011] uses a modified total depth 
definition H = f{H) instead, where / is a smooth approximation of the maximum 
operator: 

f{H) = ^ +a2 + i?) « niax(0, H), 

and a > controls the accuracy of the approximation. The remaining parameters 
in (12) are the gravitational force g — 9.81 m/s^ and the friction coefficient in the 
Chezy-Manning formulation: 

CtiH) = — — — , 

where /i e M is the user specified Manning coefficient. 

The boundary conditions are as follows: on the inflow boundary dQo a Dirichlet 
boundary condition with value m : (0, T] M is applied, which also acts as the control 
parameter. For simplicity, it is assumed that m varies only in time, i.e. is constant along 
the boundary. On the remaining boundaries, a no-normal flow boundary condition is 
applied. 

The functional of interest J measures the misfit between the observed and the sim- 
ulated wet/dry interface. For its formulation, an indicator function is constructed that 
is 1 in dry areas and in wet areas. By noting that rj > h in wet areas and 77 < ft, in 
dry areas, this indicator function is defined as H^r] — h) where % denotes the following 
smooth approximation of the Heaviside step function: 

, If X \ fO ifx < 0, 
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(a) Bathymetry (b) Mesh 

Fig. 7: The laboratory setup of the Hokkaido-Nansei-Oki tsunami example, based on 
the 1:400 laboratory experiment of Matsuyama and Tanaka [2001]. The island at the 
center and the coast on the right are hit by a tsunami coming from the left boundary. 

where again a controls the smoothness of the approximation. With that, the functional 
of interest is defined as: 

where d : 17 x (0, T] M denotes the indicator function of the observed wet/dry inter- 
face. While such inverse problems are in general ill-posed and require regularisation, 
satisfactory numerical results were obtained in this example with no regularisation 
term. The implementation of such a regularisation term in the functional would be a 
trivial modification. 

6.3.2. Implementation. The modified shallow water equations in the optimisation prob- 
lem (12) are discretised with the LBB-stable P1dg-P2 finite element pair in space 
[Cotter et al. 2009]. A simple upwinding scheme is implemented, which is obtained 
by integrating the advection term by parts, replacing the advected velocity at the in- 
flow facets with the upwind velocity and then integrating by parts again. Following 
Karna et al. [2011], the resulting equations are then discretised with a second-order 
Diagonally Implicit Runge-Kutta (DIRK22) scheme in time [Ascher et al. 1997, §2.6]. 

The implementation of problem (12) in the presented optimisation framework was 
straightforward: the control parameters appear directly in the UFL representation of 
the governing equations, and hence the framework was applicable without any modi- 
fications. 

6.3.3. Reconstruction of the Hokkaido-Nansei-Oki tsunami profile. This example is motivated 
by the question of whether it is possible to reconstruct a tsunami profile from satellite 
observations that record the inundation interface on the coast over time. 

The considered event is the Hokkaido-Nansei-Oki tsunami that occurred in 1993 
and produced run-up heights of up to 30 m on Okushiri island, Japan. The Central 
Research Institute for Electric Power Industry (CRIEPI) in Abiko, Japan constructed 
a 1:400 laboratory scale model of the area around the island [Matsuyama and Tanaka 
2001]. Following the setup used in Yalciner et al. [2011], we consider a rectangular 
domain of size 5.448 m x 3.402 m, with the bathymetry and coastal topography shown in 
figure 7a. It contains an island in the center and coastal regions on the top right of the 
domain. The left boundary is the inflow boundary dflo, on which a surface elevation 
profile is enforced that resembles a tsunami (figure 8a). The task is to reconstruct this 
wave profile. 

The domain is discretised with an unstructured mesh consisting of 1, 411 triangular 
elements with increasing resolution near the inundation areas (figure 7b). The tem- 
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Runtime (s) 


Ratio 


Forward model 


2026 




Forward model + adjoint model 


3146 


1.55 



Table V: Timings for the tsunami reconstruction gradient calculation. The efficiency of 
the adjoint approaches the theoretical ideal value of 1.5. 

poral discretisation uses a timestep of 0.5 s with a total simulation time of 32 s. The 
Manning coefficient was set to fi = 0.05 s/ma and a smoothness value of a = 0.1 was 
used. 

The observations d are synthetically generated by running the forward model with 
the wave profile that was used in the laboratory experiment while recording the 
wet/dry interface. This approach of generating the synthetic observation data with the 
same model that is used for the assimilation is referred to as an "inverse crime" [Kaipio 
and Somersalo 2004], which renders the optimisation problem less ill-posed than it ac- 
tually is. However, the main purpose of this example is to demonstrate the capabilities 
of the optimisation framework, and hence this approach is adopted for simplicity. 

The optimisation algorithm begins with an initial guess for the Dirichlet boundary 
values of 0.105 cm for all timelevels, which corresponds to the final free-surface dis- 
placement of the input wave. The tsunami signal at the boundary is applied 2 s after 
the simulation start time. The boundary condition during the final 2 s has no impact on 
the functional, as the wave does not affect the wet/dry interface before the simulation 
ends. Therefore, the boundary values at the start and end were reset to the correct 
Dirichlet boundary values and excluded from the optimisation. Furthermore, a box 
constraint was used to restrict the minimum and maximum free-surface displacement 
between —1.5 cm and +2 cm; without these constraints the first optimisation iterations 
generate unrealistically large Dirichlet boundary values for which the forward model 
does not converge. 

Figure 8 shows the results of the problem solved with the limited memory BFGS 
(L-BFGS-B) implementation in SciPy [Jones et al. 2001]. After 103 optimisation itera- 
tions (113 functional evaluations) the relative decrease of the functional of interest fell 
below machine precision and the algorithm terminated. The incoming wave was recon- 
structed to within an absolute error of 3.91 x 10~^ cm (figure 8c), which corresponds to 
a relative error of less than 3 x 10^^% of the incoming wave height. 

Table V lists the runtimes of the forward model and the gradient calculation. The 
nonlinear equations of the forward model are typically solved with 2 Newton itera- 
tions, which suggests an optimal runtime ratio of 1.5. The measured value is close to 
this theoretical value and confirms the relative efficiency of the adjoint model imple- 
mentation. 

7. SUMMARY 

In this paper we present a new framework for rapidly defining and solving PDE- 
constrained optimisation problems. The framework exploits the FEniCS system, 
dolfin-adjoint, and established optimisation algorithms to allow the user to specify the 
discretised optimisation problem in a high-level language that resembles the mathe- 
matical notation. 

The core idea of the implementation is to perform all required tasks of the optimi- 
sation process using the tape of the forward model that is recorded by dolfin-adjoint 
(analogous to the AD concept of a tape). This includes the evaluation of the objective 
functional by replaying the forward model, computing the functional gradient by de- 
riving and solving the adjoint problem, and modifying the tape in order to incorporate 
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(d) The objective functional J 



Fig. 8: Results of the reconstruction of the Hokkaido-Nansei-Oki tsunami profile. The 
initial and final 2 s of the boundary values are excluded from the reconstruction. 



parameter updates. While this paper applies the idea to the dolfin-adjoint system, the 
same approach is applicable to the operator-overloading class of AD tools that build a 
tape at runtime. 

As demonstrated, this approach reduces the required user input to a minimum: once 
the forward model has been implemented, only a handful of lines of code are required 
to specify the optimisation problem. It applies naturally to both linear and nonlinear as 
well as to both steady and time-dependent governing PDEs. Furthermore, the user has 
the choice of a variety of gradient-free and gradient-based methods. General equality 
and inequality control constraints can be applied. 

In this paper, the reduced formulation is used for solving the optimisation prob- 
lem. For cases where quasi-Newton methods applied to the reduced formulation are 
insufficient, the framework provides all ingredients necessary for more sophisticated 
approaches. Therefore, this framework is also of interest for the development of such 
advanced optimisation algorithms: by implementing an algorithm in the framework 
once, it is immediately applicable to optimisation problems across science and engi- 
neering. Future work includes the automation of shape optimisation, the development 
of the oneshot approach, multigrid optimisation techniques, and the exploitation of 
reduced-order modelling in optimisation. 
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